Algorithms and lower bounds for testing properties of structured distributions
In this doctoral thesis we consider various property testing problems for structured
distributions. A distribution is said to be structured if it belongs to a certain class
which can be simply described in approximation terms. Such distributions often arise
in practice, e.g. log-concave distributions, easily approximated by polynomials (see
[Bir87a]), often appear in econometric research. For structured distributions, testing a
property often requires far fewer samples than for general unrestricted distributions.
In this thesis we prove that this is indeed the case for several distance-related properties.
Namely, we give explicit sub-linear time algorithms for L1 and L2 distance
testing between two structured distributions for the cases when either one or both of
them are available as a “black box”.
We also prove that the given algorithms have the best possible asymptotic complexity
by proving matching lower bounds in the form of explicit problem instances (albeit
constructed using randomized techniques) demanding at least a specified amount of
data to be tested successfully.
As the main numerical result, we prove that testing that the total variation distance
to an explicitly given distribution is at least ε requires O(√k/ε²) samples, where k is an approximation parameter, dependent on the class of distribution being tested and
independent of the support size. Testing that the total variation distance between two
“black box” distributions is at least ε requires O(k^(4/5)/ε^(6/5)) samples. In some cases, when k ~ n,
this result may be worse than using an unrestricted testing algorithm (which requires
O(n^(2/3)/ε²) samples, where n is the domain size). To address this issue, we develop a third
algorithm, which requires O(k^(2/3)/ε^(4/3) · log^(4/3)(n/k) · log log(n/k)) samples and serves as a bridge between
the cases of small and large domain sizes.
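To get a feel for how the three rates quoted in this abstract compare, the sketch below evaluates them with every hidden constant set to 1. That is an assumption made purely for illustration, since the abstract states only asymptotic bounds. The function names are hypothetical, and the formulas follow the reading O(k^(4/5)/ε^(6/5)), O(n^(2/3)/ε²) and O(k^(2/3)/ε^(4/3) · log^(4/3)(n/k) · log log(n/k)).

```python
import math


def structured_closeness(k: float, eps: float) -> float:
    """Rate k^(4/5) / eps^(6/5) for two black-box structured distributions.

    Assumption: the hidden constant is taken to be 1.
    """
    return k ** (4 / 5) / eps ** (6 / 5)


def unrestricted_closeness(n: float, eps: float) -> float:
    """Rate n^(2/3) / eps^2 quoted in the abstract for the unrestricted tester.

    Assumption: the hidden constant is taken to be 1.
    """
    return n ** (2 / 3) / eps ** 2


def bridge_closeness(k: float, n: float, eps: float) -> float:
    """Rate k^(2/3) / eps^(4/3) * log^(4/3)(n/k) * log log(n/k).

    Assumption: the hidden constant is 1 and logarithms are natural.
    The expression is only positive when n/k > e, so smaller ratios are rejected.
    """
    ratio = n / k
    if ratio <= math.e:
        raise ValueError("n/k must exceed e for log log(n/k) to be positive")
    log_ratio = math.log(ratio)
    return k ** (2 / 3) / eps ** (4 / 3) * log_ratio ** (4 / 3) * math.log(log_ratio)


if __name__ == "__main__":
    n, eps = 10**6, 0.5
    for k in (10, 1_000, 100_000):
        print(
            k,
            round(structured_closeness(k, eps)),
            round(bridge_closeness(k, n, eps)),
            round(unrestricted_closeness(n, eps)),
        )
```

With constants ignored, the first rate overtakes the unrestricted one once k approaches n, which is the situation the third algorithm is described as addressing.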
Testing Identity of Structured Distributions
We study the question of identity testing for structured distributions. More
precisely, given samples from a structured distribution q over [n]
and an explicit distribution p over [n], we wish to distinguish whether
q = p versus q is at least ε-far from p, in L1 distance. In
this work, we present a unified approach that yields new, simple testers, with
sample complexity that is information-theoretically optimal, for broad classes
of structured distributions, including t-flat distributions, t-modal
distributions, log-concave distributions, monotone hazard rate (MHR)
distributions, and mixtures thereof. Comment: 21 pages, to appear in SODA'1
Optimal Algorithms and Lower Bounds for Testing Closeness of Structured Distributions
We give a general unified method that can be used for closeness
testing} of a wide range of univariate structured distribution families. More
specifically, we design a sample optimal and computationally efficient
algorithm for testing the equivalence of two unknown (potentially arbitrary)
univariate distributions under the A_k-distance metric: Given
sample access to distributions with density functions p, q: I → R,
we want to distinguish between the cases that p = q and
‖p − q‖_{A_k} ≥ ε with probability at least 2/3. We show
that for any k ≥ 2, ε > 0, the optimal sample complexity of the
A_k-closeness testing problem is Θ(max{k^(4/5)/ε^(6/5), k^(1/2)/ε²}). This is the first
o(k) sample algorithm for this problem, and yields new, simple L1 closeness
testers, in most cases with optimal sample complexity, for broad classes of
structured distributions. Comment: 27 pages, to appear in FOCS'1
Near-Optimal Closeness Testing of Discrete Histogram Distributions
We investigate the problem of testing the equivalence between two discrete
histograms. A k-histogram over [n] is a probability distribution that
is piecewise constant over some set of k intervals over [n]. Histograms
have been extensively studied in computer science and statistics. Given a set
of samples from two k-histogram distributions p, q over [n], we want to
distinguish (with high probability) between the cases that p = q and
‖p − q‖₁ ≥ ε. The main contribution of this paper is a new
algorithm for this testing problem and a nearly matching information-theoretic
lower bound. Specifically, the sample complexity of our algorithm matches our
lower bound up to a logarithmic factor, improving on previous work by
polynomial factors in the relevant parameters. Our algorithmic approach applies
in a more general setting and yields improved sample upper bounds for testing
closeness of other structured distributions as well.
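The objects in this abstract can be made concrete with a small sketch. It is not the paper's testing algorithm: it only builds a distribution that is piecewise constant on a given set of intervals and computes the exact L1 distance between two such distributions, which is the quantity a closeness tester has to reason about from samples alone. The names `k_histogram` and `l1_distance` and the 0-based indexing of the domain are assumptions made for illustration.

```python
import math


def k_histogram(n, right_ends, masses):
    """Build a probability mass function on {0, ..., n-1} that is constant
    on each of k consecutive intervals.

    right_ends: increasing exclusive right endpoints, the last one equal to n
                (hypothetical encoding of the interval partition).
    masses:     total probability assigned to each interval, summing to 1.
    """
    if len(right_ends) != len(masses) or not right_ends or right_ends[-1] != n:
        raise ValueError("need one mass per interval and a last endpoint of n")
    if any(m < 0 for m in masses) or not math.isclose(sum(masses), 1.0):
        raise ValueError("masses must be non-negative and sum to 1")
    pmf, start = [], 0
    for end, mass in zip(right_ends, masses):
        if end <= start:
            raise ValueError("endpoints must be strictly increasing")
        width = end - start
        # Spread the interval's mass evenly so the pmf is flat on the interval.
        pmf.extend([mass / width] * width)
        start = end
    return pmf


def l1_distance(p, q):
    """Exact L1 distance between two pmfs over the same domain."""
    if len(p) != len(q):
        raise ValueError("distributions must share a domain")
    return sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    p = k_histogram(8, [4, 8], [0.5, 0.5])  # 2-histogram, uniform over 8 points
    q = k_histogram(8, [2, 8], [0.5, 0.5])  # 2-histogram with a heavier first piece
    print(l1_distance(p, q))
```

A tester in the sense of the abstract never sees `p` and `q` explicitly; it must decide between equality and a distance of at least ε using only samples.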
Investigation of laser modification and light-induced electric signals in Y-Ba-Cu-O films
This paper experimentally investigates variations in the electrical and optical properties of high-temperature superconductor films. As a result, a model of the anisotropic thermo-EMF induced by laser radiation has been developed. A simple methodology for determining the dielectric permittivity and thickness of films has been suggested. The laser modification of high-temperature superconducting films has been investigated, and the applicability of the thermal model to it has been established. Non-stationary electric signals in Y-Ba-Cu-O films have been discovered and investigated. The results may find application in optical detectors and moisture-free contact laser lithography. Available from VNTIC (Scientific & Technical Information Centre of Russia).